Parameter initialization strategy


How to Initialize your Network? Robust Initialization for WeightNorm & ResNets

Arpit, Devansh, Campos, Víctor, Bengio, Yoshua

Neural Information Processing Systems

Residual networks (ResNets) and weight normalization play an important role in various deep learning applications. However, parameter initialization strategies have not been studied previously for weight normalized networks and, in practice, initialization methods designed for un-normalized networks are used as a proxy. Similarly, initialization for ResNets has also been studied only for un-normalized networks, often under simplified settings that ignore the shortcut connection. To address these issues, we propose a novel parameter initialization strategy that avoids the explosion or vanishing of information across layers in weight normalized networks, with and without residual connections. The proposed strategy is based on a theoretical analysis using a mean field approximation. We run over 2,500 experiments and evaluate our proposal on image datasets, showing that the proposed initialization outperforms existing initialization methods in terms of generalization performance, robustness to hyper-parameter values, and variance between seeds, especially as networks get deeper, where existing methods fail to even start training. Finally, we show that using our initialization in conjunction with learning rate warmup reduces the gap between the performance of weight normalized and batch normalized networks.
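
As a concrete illustration of the quantity being controlled: with weight normalization, each row of a layer's weight matrix is reparameterized as w = g · v/||v||, and the scale chosen for g at initialization decides whether activations explode or vanish with depth. Below is a minimal PyTorch sketch, assuming a He-style magnitude g = sqrt(2) for ReLU layers; this scaling is an illustrative stand-in, not the mean-field-derived scheme proposed in the paper.

```python
import math
import torch
import torch.nn as nn

def wn_linear(fan_in, fan_out):
    """Weight-normalized linear layer: each row w_i = g_i * v_i / ||v_i||."""
    layer = nn.utils.weight_norm(nn.Linear(fan_in, fan_out, bias=False))
    with torch.no_grad():
        layer.weight_v.normal_()  # direction: only v / ||v|| matters
        # He-style magnitude ||w_i|| = sqrt(2) for ReLU; an illustrative
        # choice, not the initialization derived in the paper.
        layer.weight_g.fill_(math.sqrt(2.0))
    return layer

# Sanity check: with g = sqrt(2), the activation scale stays roughly
# constant across 20 ReLU layers; a g far from sqrt(2) makes it grow or
# shrink geometrically, which is the failure mode the paper targets.
x = torch.randn(256, 512)
for _ in range(20):
    x = torch.relu(wn_linear(512, 512)(x))
print(float(x.std()))
```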


Uncertainty Distribution Assessment of Jiles-Atherton Parameter Estimation for Inrush Current Studies

Ugarte-Valdivielso, Jone, Aizpurua, Jose I., Barrenetxea-Iñarra, Manex

arXiv.org Artificial Intelligence

Transformers are one of the key assets in AC distribution grids and renewable power integration. During transformer energization, inrush currents appear, which lead to transformer degradation and can cause grid instability events. These inrush currents are a consequence of the transformer's magnetic core saturating during its connection to the grid. Transformer cores are normally modelled by the Jiles-Atherton (JA) model, which contains five parameters. These parameters can be estimated by metaheuristic search algorithms, whose convergence depends strongly on how the parameters are initialized. The most popular strategy for JA parameter initialization is sampling from a uniform distribution. However, techniques such as parameter initialization from Probability Density Functions (PDFs) have been shown to improve accuracy over purely random methods. In this context, this work presents a framework to assess the impact of different parameter initialization strategies on the performance of JA parameter estimation for inrush current studies. Depending on the available data and expert knowledge, uncertainty levels are modelled with different PDFs. Three metaheuristic search algorithms are evaluated on two core materials, and their accuracy and computational time are compared. Results show improvements in both the accuracy and the computational time of the metaheuristic algorithms when PDF-based parameter initialization is used.
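
As a concrete illustration of the two initialization styles being compared: a metaheuristic's starting population can be drawn either uniformly over the search bounds or from PDFs centred on expert priors. A minimal NumPy sketch follows, where the bounds and prior values for the five JA parameters (Ms, a, alpha, c, k) are made-up placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical search bounds for the five Jiles-Atherton parameters:
# Ms (saturation magnetization), a (domain density), alpha (inter-domain
# coupling), c (reversibility), k (pinning). Placeholder values only.
lo = np.array([1.0e5, 1.0,   1.0e-6, 0.0, 1.0])
hi = np.array([2.0e6, 1.0e3, 1.0e-2, 1.0, 1.0e3])

def init_uniform(pop_size):
    """Baseline: every candidate drawn from a uniform PDF over the bounds."""
    return rng.uniform(lo, hi, size=(pop_size, lo.size))

def init_gaussian(pop_size, prior_mean, rel_std=0.2):
    """PDF-informed: candidates drawn around expert priors, clipped to bounds."""
    mean = np.asarray(prior_mean, dtype=float)
    pop = rng.normal(mean, rel_std * np.abs(mean), size=(pop_size, mean.size))
    return np.clip(pop, lo, hi)

# Seed a 50-candidate population for a metaheuristic search.
priors = [1.6e6, 1.1e3, 1.6e-3, 0.2, 4.0e2]  # hypothetical expert estimates
pop_uniform = init_uniform(50)
pop_pdf = init_gaussian(50, priors)
```

A narrower PDF concentrates the starting population near the prior, which is what accelerates convergence when the expert estimate is good; the uniform baseline spends budget exploring regions the prior already rules out.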

